Abstract

We consider the decentralized binary hypothesis testing problem in networks with feedback, where 
some or all of the sensors have access to compressed summaries of other sensors' observations. We 
study certain two-message feedback architectures, in which every sensor sends two messages to a 
fusion center, with the second message based on full or partial knowledge of the first messages of 
the other sensors. We also study one-message feedback architectures, in which each sensor sends one 
message to a fusion center, with a group of sensors having full or partial knowledge of the messages 
from the sensors not in that group. Under either a Neyman-Pearson or a Bayesian formulation, we 
show that the asymptotically optimal (in the limit of a large number of sensors) detection performance 
(as quantified by error exponents) does not benefit from the feedback messages, if the fusion center 
remembers all sensor messages. However, feedback can improve the Bayesian detection performance 
in the one-message feedback architecture if the fusion center has limited memory; for that case, we 
determine the corresponding optimal error exponents. 

Index Terms 

Decentralized detection, feedback, error exponent, sensor networks. 



I. Introduction 

In the problem of decentralized detection, introduced by Tenney and Sandell [1], each one 
of several sensors makes an observation and sends a summary by first applying a quantization
function to its observation and then communicating the result to a fusion center. The fusion
center makes a final decision based on all of the sensor messages. The goal is to design the 
sensor quantization functions and the fusion rule so as to minimize a cost function, such as the 
probability of an incorrect final decision. 

In this paper we consider sensor network architectures that are more complex than those
in [1], and which involve feedback: some or all of the sensors have access to compressed
summaries of other sensors' observations. We are interested in characterizing the performance
under different architectures and, in particular, in determining whether the presence of feedback
can substantially enhance performance. Because an exact analysis is seemingly intractable, we
focus on the asymptotic regime, involving a large number of sensors, and quantify performance
in terms of error exponents. The somewhat unexpected conclusion is that for most of the models
considered in this paper, feedback does not improve performance in binary hypothesis testing.
The only exception we have found is Bayesian hypothesis testing in a "daisy-chain architecture"
(cf. Section II) where the fusion center has limited memory. In this configuration, feedback can
result in a better optimal error exponent.

A. Related Literature 

The decentralized detection problem has been widely studied for various network architectures,
including the above described "parallel" configuration of [1] (see [2]-[13]), tandem networks
[14]-[17], and bounded height tree architectures [18]-[26]. For sensor observations not
conditionally independent given the hypothesis, the problem of designing the quantization functions
is known to be NP-hard [27]. For this reason, most of the literature assumes that the sensor
observations are conditionally independent. (Some works [28]-[30] have considered the case of
correlated observations under Gaussian models, but without addressing the problem of designing
optimal quantization functions.)

Non-tree networks are harder to analyze because the different messages received by a sensor 
are not in general conditionally independent. While some structural properties of optimal decision 
rules are available (see, e.g., [31]), not much is known about the optimal performance. Networks
with feedback face the same difficulty, and the relevant literature (discussed in the next paragraph) 
is limited. 

A variety of feedback architectures, under a Bayesian formulation, have been studied in
[22], [32]. These references show that it is person-by-person optimal for every sensor to use
a likelihood ratio quantizer, with thresholds that depend on the feedback messages. However,
because of the difficulty of optimizing these thresholds when the number of sensors becomes
large, it is difficult to analytically compare the performance of networks with and without
feedback. Numerical examples in [32] show that a system with feedback has lower probability
of error, as expected. To better understand the asymptotics of the error probability, [33] studies
the error probability decay rate under a Neyman-Pearson formulation for two different feedback
architectures. For either case, it shows that if the fusion center also has access to the fed back
messages, then feedback does not improve the optimal error exponent. References [34], [35]
consider the Neyman-Pearson problem in a daisy-chain architecture (see Figure 2), and obtain
a similar result. However, the analogous questions under a Bayesian formulation were left open
in [33]-[35].

B. Summary and Contributions 

In this paper, we revisit some of the architectures studied in [33]-[35], and extend the available
results. We also study certain feedback architectures that have not been studied before. In what 
follows, we describe briefly the architectures that we consider, and summarize our results. 

1) We study a new two-message sequential feedback architecture. Sensors are indexed, and 
the second message of a sensor can take into account the first message of all sensors with 
lower indices. We show that under either the Neyman-Pearson or Bayesian formulation, 
feedback does not improve the error exponent. 

2) We consider the two-message full feedback architecture studied in [33]. Here, each sensor
gets to transmit two messages, and the second message can take into account the first
messages of all sensors. We resolve an open problem for the Bayesian formulation, by
showing that there is no performance gain over the non-feedback case. We also provide a
variant of the result of [33] for the Neyman-Pearson case. Our model is somewhat more
general than that in [33], because we do not restrict the sensors' raw observations and the
sensor messages to be finitely-valued. More crucially, we also remove the constraint in [33]
that the feedback message alphabet can grow at most subexponentially with the number of
sensors.

3) We consider the one-message sequential feedback architecture studied in [36], [37] (under
the name of "full observation network topology"), where sensors are indexed, and each
sensor knows the messages of all sensors with lower indices. Unlike [36], [37], which
investigate "myopic" strategies where each sensor selfishly minimizes its local error
probability, we show that if there is cooperation amongst sensors so that the last sensor makes
the final decision for the whole network, there is no loss of asymptotic optimality if sensors
other than the last ignore information from the other sensors, for both the Neyman-Pearson
and the Bayesian formulation.

4) We consider the daisy chain or one-message architectures studied in [34], under which the
sensors are divided into two groups, and sensors in the second group have full or partial
knowledge of the messages sent by the first group. Reference [34] dealt with the
Neyman-Pearson formulation. In this paper, we turn to the Bayesian formulation and resolve several
questions that had been left open.

a) In a full feedback daisy chain, sensors in the second group, as well as the fusion center, 
have access to all messages sent by sensors in the first group. Similar to the Neyman- 
Pearson case, we show that the Bayesian optimal error exponent is the same as for a 
parallel configuration with the same number of sensors; in particular, feedback offers no 
performance improvement. 

b) In a restricted feedback daisy chain, the second group of sensors, as well as the fusion 
center, have access to only a 1-bit summary of the messages sent by sensors in the first 
group. For the Neyman-Pearson formulation, [35] shows that feedback does not improve
the error exponent. In contrast, for the Bayesian formulation, we show that in general, 
feeding this 1-bit summary to the second group of sensors can improve the detection 
performance. We provide sufficient conditions for feedback to result in no performance 
gain. Furthermore, we show that this architecture is strictly inferior to the full feedback 
daisy chain and the parallel configuration. We also provide a characterization of the 
optimal error exponent. 

The remainder of the paper is organized as follows. In Section II, we define the model,
formulate the problems that we will be studying, and provide some background material. In
Section III, we study two-message feedback architectures (sequential and full feedback). In
Section IV, we study one-message feedback architectures. We offer concluding remarks and
discuss open problems in Section V. Some mathematical results that we use frequently are
presented in the Appendix.

II. Problem Formulation 

In this section, we describe the feedback architectures of interest, define our model, and present 
some preliminary results. We consider a decentralized binary detection problem involving $n$
sensors and a fusion center. Each sensor $k$ observes a random variable $X_k$ taking values in some
measurable space $(\mathcal{X}, \mathcal{F})$, and distributed according to a measure $\mathbb{P}_j$ under hypothesis $H_j$,
for $j = 0, 1$. Under either hypothesis $H_j$, $j = 0, 1$, the random variables $X_k$ are assumed to be
i.i.d. We use $\mathbb{E}_j$ to denote the expectation operator with respect to (w.r.t.) $\mathbb{P}_j$, and $X_1^n$ to denote
the vector $(X_1, \ldots, X_n)$. A similar notation, e.g., $Y_1^n$, will be used for other vectors of random
variables as well.

Let $\mathcal{T}$ be the set from which messages take their values. In most engineering applications, $\mathcal{T}$
is assumed to be a finite alphabet, although we do not require this restriction. This allows us to
model the received messages at the fusion center over noisy channels. Furthermore, we use $\Gamma$
to denote the set of allowed quantization functions, that is, functions $\gamma : \mathcal{X} \mapsto \mathcal{T}$ that can be
used to map observations to messages. One possible choice is to let $\Gamma$ consist of all measurable
functions. Alternatively, for the problems considered in this paper, it is known that for $\mathcal{T}$ finite,
there is no loss of optimality if we let $\Gamma$ be the set of likelihood-ratio quantizers [22], [31], [32].

We consider two classes of feedback architectures: the two-message and one-message archi- 
tectures. 

A. Two-Message Feedback Architectures 

In two-message feedback architectures (see Figure 1), each sensor $k$ sends a message $Y_k =
\gamma_k(X_k)$, with $\gamma_k \in \Gamma$, which is a "quantized" version of its observation $X_k$, to the fusion center.

Fig. 1. A two-message architecture.

We assume that the sensors are indexed in the order that they send their messages to the
fusion center. We consider three forms of feedback under the two-message architecture.

(a) Sequential feedback. Here, for $k = 2, \ldots, n$, the feedback message sent by the fusion
center to sensor $k$ is $W_k = (Y_1, \ldots, Y_{k-1})$, the vector of messages generated by the previous
sensors.

(b) Full feedback. The feedback message sent by the fusion center to sensor $k$ is the vector
$W_k = (Y_1^{k-1}, Y_{k+1}^n)$ of messages generated by all of the other sensors.

(c) Restricted feedback. The feedback message sent by the fusion center to sensor $k$ is a
function $W_k = f_k(Y_1^{k-1}, Y_{k+1}^n)$ of the other sensors' first messages, whose alphabet does
not increase with the number of sensors.

In all of the above scenarios, each sensor forms a new, second message $Z_k = \delta_k(X_k, W_k)$
based on the additional information $W_k$, and sends it to the fusion center.
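The message flow of the sequential feedback variant can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold rules `first_message` and `second_message`, the Gaussian observations, and the function names are hypothetical and are not the quantizers analyzed in this paper. The point is only that the feedback to sensor k consists of the first messages of the sensors before it.

```python
import random

def first_message(x, threshold=0.0):
    """Hypothetical binary quantizer gamma_k: 1 if the observation exceeds a threshold."""
    return 1 if x > threshold else 0

def second_message(x, w):
    """Hypothetical delta_k: a threshold rule whose threshold depends on the feedback w."""
    # Illustrative choice only: shift the threshold by the fraction of earlier 1s.
    frac_ones = sum(w) / len(w) if w else 0.5
    threshold = 0.5 - frac_ones
    return 1 if x > threshold else 0

def run_sequential_feedback(xs):
    """Return (Y, W, Z) for one pass of the two-message sequential feedback scheme."""
    ys = [first_message(x) for x in xs]
    # Feedback to sensor k: first messages of all earlier sensors (empty for the first sensor).
    ws = [tuple(ys[:k]) for k in range(len(xs))]
    zs = [second_message(x, w) for x, w in zip(xs, ws)]
    return ys, ws, zs

rng = random.Random(0)
xs = [rng.gauss(1.0, 1.0) for _ in range(5)]  # assumed observation model, for illustration
ys, ws, zs = run_sequential_feedback(xs)
```

In the full feedback variant, `ws[k]` would instead contain the first messages of all sensors other than k.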

For simplicity, we assume that $Z_k$ takes values in the same alphabet $\mathcal{T}$ and, furthermore, that
for any $w$, the function $\delta_k^w(\cdot) = \delta_k(\cdot, w)$ is constrained to belong to the same set $\Gamma$ that applies
to the first round. As alluded to earlier, when $\mathcal{T}$ is finite, it is known that there is no loss of
optimality if we restrict to log-likelihood ratio quantizers of $X_k$, with thresholds that depend on
the received messages.

Finally, the fusion center makes a decision $Y_f = \gamma_f(Y_1^n, Z_1^n)$. Here, we assume that the fusion
center always remembers the first messages $Y_1, \ldots, Y_n$. The collection $(\gamma_f, \gamma_1, \ldots, \gamma_n, \delta_1, \ldots, \delta_n)$
is called a strategy. A sequence of strategies, one for each value of $n$, is called a strategy sequence.
We wish to design strategy sequences that are asymptotically optimal (in the sense of error
exponents), as $n$ increases to infinity.

B. One-Message Feedback Architectures

In one-message architectures, every sensor sends a single message to an intermediate aggre- 
gator or the fusion center, but some of the sensors have access to the messages of some other 
sensors. Specifically, we consider a one-message sequential feedback architecture [36], [37], and
a daisy chain architecture [34], [35]. As before, we let $\Gamma$ be the set of allowed quantization
functions. 

(a) One-message sequential feedback. Here, sensor $k$ has access to the messages $Y_1, \ldots, Y_{k-1}$
of all sensors with lower indices. Sensor $k$ forms a message $Y_k = \gamma_k(X_k, Y_1^{k-1})$, and
broadcasts it to all sensors with higher indices. The last sensor, $n$, makes a final decision
and plays the role of a fusion center. We assume that for any $Y_1^{k-1}$, the mapping from $X_k$
to $Y_k$ belongs to $\Gamma$.

(b) Daisy chain. This architecture consists of two stages (see Figure 2), with the first stage
involving $m$ sensors and the second $n - m$. Each sensor $k$ in the first stage sends a message
$Y_k = \gamma_k(X_k)$ to an aggregator, with $\gamma_k \in \Gamma$. The aggregator forms a message $U$ that is
broadcast to all sensors in the second stage and to the fusion center. Each sensor $l$ in
the second stage forms a message $Z_l = \delta_l^U(X_l) = \delta_l(X_l, U)$, which depends on its own
observation and the message $U$. Again, we assume that $\delta_l^u \in \Gamma$, for every possible value $u$ of
$U$. The fusion center makes a final decision using a fusion rule $Y_f = \gamma_f(U, Z_{m+1}, \ldots, Z_n)$.
We can view the daisy chain as a parallel configuration, in which the fusion center feeds
sensors $m + 1, \ldots, n$ with a message based on information from sensors $1, \ldots, m$.

We consider two cases for how U is formed. 

(i) Full feedback daisy chain. Here, we let $U = (Y_1, \ldots, Y_m)$, i.e., the second stage
sensors and fusion center have the full information available at the first stage aggregator.

(ii) Restricted feedback daisy chain. Here, we let $U = \gamma_u(Y_1, \ldots, Y_m) \in \{0, 1\}$. This
architecture can be viewed as a parallel configuration in which the fusion center makes
a preliminary decision based on the messages from the first $m$ sensors, broadcasts
the preliminary decision, and forgets (e.g., due to memory or security constraints) the
messages sent by the first $m$ sensors.
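To make the information pattern of the restricted feedback daisy chain concrete, the following Python sketch runs one pass through the two stages. All rules in it are hypothetical placeholders chosen for illustration (a majority vote for the 1-bit summary, fixed thresholds for the quantizers, a vote count for the fusion rule); they are assumptions, not the optimal rules studied later.

```python
def daisy_chain_restricted(xs, m):
    """One pass through a restricted feedback daisy chain with hypothetical rules."""
    # First stage: Y_k = gamma_k(X_k), here a fixed threshold at 0 (an assumption).
    first_stage = [1 if x > 0.0 else 0 for x in xs[:m]]
    # Aggregator: 1-bit summary U of the first-stage messages (majority vote, an assumption).
    u = 1 if 2 * sum(first_stage) > m else 0
    # Second stage: Z_l = delta_l(X_l, U), a threshold that depends on the broadcast bit.
    threshold = -0.25 if u == 1 else 0.25
    zs = [1 if x > threshold else 0 for x in xs[m:]]
    # Fusion rule: sees only U and the second-stage messages, not the first-stage ones.
    votes = sum(zs) + u
    decision = 1 if 2 * votes > len(zs) + 1 else 0
    return u, zs, decision

result = daisy_chain_restricted([1.0, 2.0, -0.5, 0.1, -0.1, 3.0], m=3)
```

The fusion rule never touches `first_stage` directly, which is what distinguishes this case from the full feedback daisy chain.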



September 1, 2011 DRAFT




Fig. 2. The daisy chain architecture. 



C. Assumptions and Preliminaries 

In this section, we list the basic assumptions that we will be making throughout this paper, 
and note a useful consequence that will be used in our subsequent proofs. 

Let $\mathbb{P}_i^X$ be the distribution of a random variable $X$ under hypothesis $H_i$. Consider the
Radon-Nikodym derivative $d\mathbb{P}_i^X / d\mathbb{P}_j^X$ of the measure $\mathbb{P}_i^X$ with respect to the measure $\mathbb{P}_j^X$. Informally,
this is the likelihood ratio associated with an observation of $X$, and is a random variable
whose value is determined by $X$; accordingly, its value should be denoted by a notation such
as $\ell_{ij}^X(X)$, where $\ell_{ij}^X$ is a function from $\mathcal{X}$ into $[0, \infty)$ determined by the distributions of $X$
under the two hypotheses. However, in order to avoid cluttered expressions, we will abuse
notation and just write $\ell_{ij}(X)$. Furthermore, to simplify notation, we use $\ell_{ij}(X, Y)$ in place
of $\ell_{ij}((X, Y))$, and similarly for random vectors of arbitrary length. We also use $\ell_{ij}(\gamma(X))$ to
denote the Radon-Nikodym derivative of the random variable $Z = \gamma(X)$. Throughout the paper,
we deal with various conditional distributions. Abusing notation as before, we let $\ell_{ij}(X \mid Y)$ be
the Radon-Nikodym derivative of the conditional distribution of $X$ given $Y$. Other notations like
$\ell_{ij}(\gamma(X) \mid Y)$ will also be used.

We make the following assumptions. The first assumption results in no loss of generality (see
[38]). The second assumption is made to simplify the exposition, and can often be relaxed. See
[39] for a discussion.

Assumption 1: The measures $\mathbb{P}_0$ and $\mathbb{P}_1$ are absolutely continuous w.r.t. each other.

Assumption 2: We have $\mathbb{E}_i[\log^2 \ell_{ij}(X_1)] < \infty$ for $i, j = 0, 1$.$^1$

Assumption 2 implies the following lemma, the proof of which follows from Proposition A.1
in Appendix A of [39].

Lemma 1: There exists some finite constant $a$, such that for all $\gamma \in \Gamma$, and $i, j = 0, 1$,

$$\mathbb{E}_i\big[\log^2 \ell_{ij}(\gamma(X_1))\big] \le \mathbb{E}_i\big[\log^2 \ell_{ij}(X_1)\big] + 1 \le a,$$
$$\mathbb{E}_i\big[|\log \ell_{ij}(\gamma(X_1))|\big] \le a.$$

III. Two-Message Architectures 

In this section, we study the Neyman-Pearson and Bayesian formulations of the decentralized
detection problem in two-message architectures. For $i, j \in \{0, 1\}$, we consider the log-likelihood
ratio at the fusion center, a random variable denoted by $\mathcal{L}_{ij}^{(n)}$. We have

$$\begin{aligned}
\mathcal{L}_{ij}^{(n)} &= \log \ell_{ij}(Y_1^n, Z_1^n) \\
&= \sum_{k=1}^n \log \ell_{ij}(Y_k) + \log \ell_{ij}(Z_1^n \mid Y_1^n) \\
&= \sum_{k=1}^n \log \ell_{ij}(Y_k) + \sum_{k=1}^n \log \ell_{ij}(Z_k \mid Y_1^n) \\
&= \sum_{k=1}^n \log \ell_{ij}(Y_k) + \sum_{k=1}^n \log \ell_{ij}(Z_k \mid Y_k, W_k).
\end{aligned}$$

The second equality above holds because, under either hypothesis, and given $Y_1^n$, the random
variables $Z_k$ are functions of the respective $X_k$; thus, the $Z_k$ are conditionally independent, given
$Y_1^n$. The last equality holds because $Z_k$ depends on $Y_1^n$ through $Y_k$ and $W_k$.

To simplify notation, we define, for every possible value $w$ of $W_k$, a random variable $\mathcal{L}_{ij}^k(w)$,
according to

$$\begin{aligned}
\mathcal{L}_{ij}^k(w) &= \log \ell_{ij}(Y_k) + \log \ell_{ij}(Z_k \mid Y_k, W_k = w) \\
&= \log \ell_{ij}\big(\gamma_k(X_k), \delta_k^w(X_k)\big).
\end{aligned}$$

Note that $\mathcal{L}_{ij}^k(w)$ is a random variable which is a function of a non-random argument $w$ and the
random variable $X_k$. Note also that

$$\mathcal{L}_{ij}^{(n)} = \sum_{k=1}^n \mathcal{L}_{ij}^k(W_k).$$

1 For the Neyman-Pearson formulation, we will only require that this assumption and Lemma 1 hold for $i = 0$, $j = 1$.

A. Neyman-Pearson Formulation 

Let $\alpha \in (0, 1)$ be a given constant. A strategy is admissible if its Type I error probability
satisfies $\mathbb{P}_0(Y_f = 1) < \alpha$. Let $\beta^* = \inf \mathbb{P}_1(Y_f = 0)$, where the infimum is taken over all
admissible strategies for the $n$-sensor problem. Our objective is to characterize the optimal error
exponent $\limsup_{n \to \infty} (1/n) \log \beta^*$, under different feedback architectures.

Let $g_{2p}^*$ be the optimal error exponent for the two-message parallel configuration, in which
there is no feedback from the fusion center, i.e., when each sensor $k$ sends two messages,
$(\gamma_k(X_k), \delta_k(X_k))$, to the fusion center. From [39], the optimal error exponent is

$$g_{2p}^* = \inf_{(\gamma, \delta) \in \Gamma^2} \mathbb{E}_0\big[\log \ell_{10}(\gamma(X_1), \delta(X_1))\big].$$

Let $g_{sf}^*$, $g_f^*$, and $g_{rf}^*$ be the optimal error exponents for the sequential, full, and restricted feedback
architectures respectively. Since the sensors can ignore some or all of the feedback messages
from the fusion center, we have

$$g_f^* \le g_{sf}^* \le g_{2p}^*, \tag{1}$$
$$g_f^* \le g_{rf}^* \le g_{2p}^*. \tag{2}$$

(Note that error exponents are nonpositive and that smaller error exponents correspond to better
performance.)
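For messages taking finitely many values, the quantity inside the infimum defining the parallel-configuration exponent is the negative Kullback-Leibler divergence between the message distributions under the two hypotheses. The Python sketch below evaluates it for a hypothetical four-valued observation and a coarser one-bit summary; the probability vectors and the name `np_exponent` are illustrative assumptions, not values from the paper.

```python
import math

def np_exponent(p0, p1):
    """E_0[log l_10(message)] = -KL(p0 || p1) for message pmfs p0, p1 on a finite alphabet.

    Assumes p1 > 0 wherever p0 > 0 (mutual absolute continuity).
    """
    return sum(a * math.log(b / a) for a, b in zip(p0, p1) if a > 0)

# Hypothetical pmfs of a message pair that reveals the observation exactly.
fine_p0, fine_p1 = [0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]
# A coarser one-bit message that merges the first two and the last two values.
coarse_p0 = [fine_p0[0] + fine_p0[1], fine_p0[2] + fine_p0[3]]
coarse_p1 = [fine_p1[0] + fine_p1[1], fine_p1[2] + fine_p1[3]]

g_fine = np_exponent(fine_p0, fine_p1)
g_coarse = np_exponent(coarse_p0, coarse_p1)
```

Both values are negative, and the finer message gives the smaller (better) exponent, in line with the sign convention stated above.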

We will show that under appropriate but mild assumptions, the inequalities in (1) and (2)
are equalities. Hence, from an asymptotic viewpoint, feedback results in no gain in detection
performance. We first show a useful result that underlies a key step in our proofs.

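Lemma 2: Consider a sequence of strategies, indexed by the number $n$ of sensors, and let $\beta_n$
be the associated Type II error probabilities. Let $R$ be a nonnegative constant. If

$$\limsup_{n \to \infty} \mathbb{P}_0\big(\mathcal{L}_{10}^{(n)} \le -nR\big) < 1 - \alpha,$$

then

$$\liminf_{n \to \infty} \frac{1}{n} \log \beta_n \ge -R.$$

Proof: We have

$$\begin{aligned}
\beta_n &= \mathbb{P}_1(Y_f = 0) \\
&= \mathbb{E}_0\big[\exp(\mathcal{L}_{10}^{(n)}) 1_{\{Y_f = 0\}}\big] \\
&\ge \mathbb{E}_0\big[\exp(\mathcal{L}_{10}^{(n)}) 1_{\{Y_f = 0,\, \mathcal{L}_{10}^{(n)} > -nR\}}\big] \\
&\ge e^{-nR}\, \mathbb{P}_0\big(Y_f = 0,\, \mathcal{L}_{10}^{(n)} > -nR\big).
\end{aligned}$$

Therefore,

$$\mathbb{P}_0\big(Y_f = 0,\, \mathcal{L}_{10}^{(n)} > -nR\big) \le \beta_n e^{nR}.$$

This upper bound yields

$$1 - \alpha < \mathbb{P}_0(Y_f = 0) \le \beta_n e^{nR} + \mathbb{P}_0\big(\mathcal{L}_{10}^{(n)} \le -nR\big),$$

and

$$\frac{1}{n} \log \beta_n + R \ge \frac{1}{n} \log\Big(1 - \alpha - \mathbb{P}_0\big(\mathcal{L}_{10}^{(n)} \le -nR\big)\Big).$$

The desired result follows by taking the limit as $n \to \infty$. ■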

B. Neyman-Pearson Formulation: Sequential Feedback

For the case of sequential feedback, the proof that feedback yields no performance
improvement is relatively simple. The core of the proof is an inequality on the (conditional) expectation
of the log-likelihood ratio at the fusion center. We use this inequality together with a variance
bound to obtain a bound on the tail probabilities associated with the log-likelihood ratio, and
finally use Lemma 2.

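Theorem 1: Suppose that Assumptions 1-2 hold. Then, the optimal error exponent for the
sequential feedback architecture is $g_{sf}^* = g_{2p}^*$. Moreover, there is no loss in optimality if the
sensors ignore the feedback messages from the fusion center and are constrained to using the
same quantization function.

Proof: From (1), we have $g_{sf}^* \le g_{2p}^*$. To show the reverse inequality, we first bound
$\mathbb{E}_0[\mathcal{L}_{10}^k(W_k) \mid W_k]$ from below by $g_{2p}^*$. We have, for any $w$,

$$\begin{aligned}
\mathbb{E}_0\big[\mathcal{L}_{10}^k(W_k) \mid W_k = w\big] &= \mathbb{E}_0\big[\log \ell_{10}(\gamma_k(X_k), \delta_k^w(X_k)) \mid W_k = w\big] \\
&\ge \inf_{(\gamma, \delta) \in \Gamma^2} \mathbb{E}_0\big[\log \ell_{10}(\gamma(X_1), \delta(X_1))\big] \\
&= g_{2p}^*.
\end{aligned} \tag{3}$$

In particular, $\mathbb{E}_0[\mathcal{L}_{10}^k(W_k)] \ge g_{2p}^*$ and $\mathbb{E}_0[\mathcal{L}_{10}^{(n)}] \ge n g_{2p}^*$.

We next obtain a suitable variance bound. Let $Q_k = \mathcal{L}_{10}^k(W_k) - \mathbb{E}_0[\mathcal{L}_{10}^k(W_k) \mid W_k]$. From
Lemma 1, there exists some constant $a > 0$ such that

$$\mathrm{var}(Q_k) \le \mathbb{E}_0\Big[\mathbb{E}_0\big[(\mathcal{L}_{10}^k(W_k))^2 \mid W_k\big]\Big] \le a. \tag{4}$$

Recall that $W_k = Y_1^{k-1}$. Since $Q_m$ is determined by $X_m$ and $W_k$ when $m < k$, and $X_k$ is
independent of $X_1^{k-1}$, we have, for $m < k$,

$$\mathbb{E}_0[Q_m Q_k] = \mathbb{E}_0\big[Q_m\, \mathbb{E}_0[Q_k \mid X_m, W_k]\big] = 0. \tag{5}$$

Let $\epsilon > 0$. Inequality (3), together with the bounds (4) and (5), and Chebyshev's inequality,
yield

$$\mathbb{P}_0\big(\mathcal{L}_{10}^{(n)} \le n(1 + \epsilon) g_{2p}^*\big) \le \mathbb{P}_0\Big(\sum_{k=1}^n Q_k \le n \epsilon g_{2p}^*\Big) \le \frac{a}{n \epsilon^2 (g_{2p}^*)^2}.$$

Letting $n \to \infty$, we get

$$\lim_{n \to \infty} \mathbb{P}_0\big(\mathcal{L}_{10}^{(n)} \le n(1 + \epsilon) g_{2p}^*\big) = 0 < 1 - \alpha.$$

Therefore, applying Lemma 2, we have $g_{sf}^* \ge (1 + \epsilon) g_{2p}^*$. Since $\epsilon$ was chosen arbitrarily, we
obtain $g_{sf}^* \ge g_{2p}^*$, and the proof is complete. ■

C. Neyman-Pearson Formulation: Full Feedback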

Next, we consider the full feedback architecture. The same architecture has been studied in
[33], using the method of types, and under a more restrictive set of assumptions. In the following,
we show a result similar to the one in [33], i.e., that there is no gain from the feedback messages
asymptotically. For a comparison, we note that [33] involved a constraint that feedback messages
take values in an alphabet that grows at most subexponentially. This constraint excludes the full
feedback case, in which the feedback messages $W_k$ take values in an exponentially growing
alphabet.

The following result subsumes, in some sense, Theorem 1; indeed, if full feedback cannot
improve performance, then sequential feedback cannot either. On the other hand, for this more
general result we will need a stronger assumption. In Theorem 1, we used the property that
the "innovations" $\mathcal{L}_{10}^m(W_m) - \mathbb{E}_0[\mathcal{L}_{10}^m(W_m) \mid W_m]$ were uncorrelated, which allowed us to use
Chebyshev's inequality. Such a property is no longer true in the full feedback case. Instead, we
impose an exponential tail bound on the original log-likelihood ratios; equivalently, we make a
finiteness assumption on the log moment generating function of the original log-likelihood ratios
about a neighborhood of the origin, which is standard in the theory of large deviations [40].
We then proceed to derive related bounds that refer to the log-likelihood ratios associated with
various messages. This step is somewhat tedious but unsurprising.

We define $b(s) = \log \mathbb{E}_0[(\ell_{10}(X_1))^s]$, which is the log moment generating function of the
log-likelihood ratio $\log \ell_{10}(X_1)$.
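As a concrete illustration of this log moment generating function, consider a hypothetical Gaussian shift model that is not taken from the paper: the observation is N(0, 1) under the null hypothesis and N(mu, 1) under the alternative. The log-likelihood ratio is then mu*x - mu^2/2, and a direct computation gives b(s) = mu^2 s (s - 1) / 2, which is finite for every real s. The Python sketch below checks this closed form against a plain trapezoidal integration; the function names and integration grid are assumptions made for the example.

```python
import math

def log_mgf_numeric(s, mu, lo=-12.0, hi=12.0, steps=24000):
    """Trapezoidal approximation of b(s) = log E_0[l_10(X)^s] for N(0,1) versus N(mu,1)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        log_lr = mu * x - 0.5 * mu * mu                       # log l_10(x)
        density0 = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, steps) else 1.0              # trapezoid end-point weights
        total += weight * math.exp(s * log_lr) * density0
    return math.log(total * h)

def log_mgf_closed_form(s, mu):
    """Closed form for the same Gaussian shift model."""
    return 0.5 * mu * mu * s * (s - 1.0)
```

For negative s the closed form is positive and decreases as s increases towards 0, where it vanishes.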

Assumption 3: There exists some $\bar{s} < 0$ such that $b(\bar{s}) < \infty$.

Since $b(\cdot)$ is nonincreasing on $[\bar{s}, 0]$ (cf. Lemma 2.2.5 of [40]), Assumption 3 implies that
$b(s) < \infty$ for all $s \in [\bar{s}, 0]$. Furthermore, the second moment of $\log \ell_{10}(X_1)$ under $H_0$ exists and
is finite. Therefore, Assumption 3 implies Assumption 2 for $i = 0$ and $j = 1$.

Consider a pair $\xi = (\gamma, \delta) \in \Gamma^2$ of quantization functions. Let $\varphi_\xi(s) = \log \mathbb{E}_0[(\ell_{10}(\xi(X_1)))^s]$
be the log moment generating function of the log-likelihood ratio of the distribution of $\xi(X_1) =
(\gamma(X_1), \delta(X_1))$ under $H_1$ versus that under $H_0$. Suppose that a strategy sequence has been fixed.
Let $\psi_n(s) = \log \mathbb{E}_0[\exp(s \mathcal{L}_{10}^{(n)})]$ be the log moment generating function of $\mathcal{L}_{10}^{(n)}$.

Based on Assumption 3, we will show some properties of $\varphi_\xi$ and $\psi_n$. We will then use these
properties to obtain tail bounds on $\mathcal{L}_{10}^{(n)}$, which will play the same role as the Chebyshev bound
in the proof of Theorem 1.$^2$

Lemma 3: Suppose Assumptions 1-3 hold, and let $\bar{s}$ be as in Assumption 3.

(i) There exists a positive constant $c$ such that for all $s \in [\bar{s}/2, 0]$, and for all $\xi \in \Gamma^2$, we have
$0 \le \varphi_\xi''(s) \le c$.

(ii) Let $\xi^* \in \Gamma^2$ be such that $\varphi_{\xi^*}'(0) \le g_{2p}^* + \epsilon$, where $\epsilon$ is a small positive constant so that
$h = \sqrt{\epsilon/(2c)} \le \min\{|\bar{s}|/2, 1/4\}$. Then, for all $s \in [-h, 0]$, and for all $\xi \in \Gamma^2$, we have
$\varphi_\xi(s) \le \varphi_{\xi^*}(s) + \epsilon/2$.

(iii) For all $n \ge 1$ and $s \in [-h, 0]$, we have $\psi_n(s)/n \le \varphi_{\xi^*}(s) + \epsilon$.

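Proof: We first prove claim (i). From Lemma 2.2.5 of [40], $\varphi_\xi$ is a convex function with
nonnegative second derivatives. We next show that its second derivative is uniformly upper
bounded for all $\xi \in \Gamma^2$. From Lemma A.1, we have, for all $s \in [\bar{s}, 0]$,

$$\mathbb{E}_0\big[(\ell_{10}(\xi(X_1)))^s\big] \le \mathbb{E}_0\big[(\ell_{10}(X_1))^s\big] \le e^{b(\bar{s})}. \tag{6}$$

Let $f(s) = \mathbb{E}_0[(\ell_{10}(\xi(X_1)))^s]$, and $\eta = \min\{|\bar{s}|/2, 1\}$. There exists a positive constant $M$ such
that for all $|x| > M$, we have $x^2 \le \exp(\eta x) + \exp(-\eta x)$. Making use of this bound, we obtain

$$\begin{aligned}
\varphi_\xi''(s) &\le \frac{\mathbb{E}_0\big[(\ell_{10}(\xi(X_1)))^s \log^2 \ell_{10}(\xi(X_1))\big]}{\mathbb{E}_0\big[(\ell_{10}(\xi(X_1)))^s\big]} \\
&\le \mathbb{E}_0\big[(\ell_{10}(\xi(X_1)))^s \log^2 \ell_{10}(\xi(X_1))\big] \\
&\le M^2 f(s) + f(s + \eta) + f(s - \eta) \\
&\le (M^2 + 2) f(\bar{s}) \\
&\le (M^2 + 2) e^{b(\bar{s})}.
\end{aligned}$$

The penultimate inequality follows from the bounds $\bar{s} \le s + \eta \le 1$ and $\bar{s} \le s - \eta \le 0$, and the facts
that $f(x)$ is nonincreasing over $[\bar{s}, 0]$, while $f(x) \le 1 \le f(\bar{s})$ for $x \in [0, 1]$. The final inequality
follows from (6). Claim (i) is now proved.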

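We now use a Taylor series expansion to prove claim (ii). Since $\varphi_\xi(0) = 0$ for any $\xi \in \Gamma^2$,
we have for $s \in [-h, 0]$,

$$\begin{aligned}
\varphi_\xi(s) - \varphi_{\xi^*}(s) &= \big(\varphi_\xi'(0) - \varphi_{\xi^*}'(0)\big) s + \big(\varphi_\xi''(s_1) - \varphi_{\xi^*}''(s_2)\big) \frac{s^2}{2} \\
&\le \epsilon |s| + c \frac{s^2}{2} \\
&\le \epsilon/2,
\end{aligned}$$

where $s_1$ and $s_2$ are between $s$ and $0$, and the first inequality follows from $\varphi_\xi'(0) \ge g_{2p}^*$, $\varphi_{\xi^*}'(0) \le
g_{2p}^* + \epsilon$, $\varphi_\xi''(s_1) \le c$, and $\varphi_{\xi^*}''(s_2) \ge 0$.

2 Throughout the paper, we use $f'(s)$ and $f''(s)$ to denote the first and second derivatives of $f$ w.r.t. $s$.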

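Finally, we turn to the proof of claim (iii). Recall that $Z_k = \delta_k^{W_k}(X_k)$. For $s \in [-h, 0]$, we
have

$$\begin{aligned}
\mathbb{E}_0\Big[\prod_{k=1}^n \big(\ell_{10}(Z_k \mid Y_k, W_k)\big)^s \,\Big|\, Y_1^n\Big]
&= \prod_{k=1}^n \mathbb{E}_0\Big[\big(\ell_{10}(Z_k \mid Y_k, W_k)\big)^s \,\Big|\, Y_1^n\Big] \\
&\le \prod_{k=1}^n \Big(\mathbb{E}_0\Big[\big(\ell_{10}(\delta^{Y_k}(X_k) \mid Y_k)\big)^s \,\Big|\, Y_k\Big] + \epsilon_1\Big),
\end{aligned} \tag{7}$$

where $\epsilon_1 = \epsilon \exp(-b(\bar{s}))/2$, and $\delta^{Y_k} \in \Gamma$ is a function depending on the value of $Y_k$, and is such
that

$$\mathbb{E}_0\Big[\big(\ell_{10}(\delta^{Y_k}(X_k) \mid Y_k)\big)^s \,\Big|\, Y_k\Big] \ge \sup_{\delta \in \Gamma} \mathbb{E}_0\Big[\big(\ell_{10}(\delta(X_k) \mid Y_k)\big)^s \,\Big|\, Y_k\Big] - \epsilon_1.$$

From (7), we have

$$\begin{aligned}
\frac{1}{n} \psi_n(s) &= \frac{1}{n} \log \mathbb{E}_0\Big[\big(\ell_{10}(Y_1^n)\big)^s\, \mathbb{E}_0\Big[\prod_{k=1}^n \big(\ell_{10}(Z_k \mid Y_k, W_k)\big)^s \,\Big|\, Y_1^n\Big]\Big] \\
&\le \frac{1}{n} \log \mathbb{E}_0\Big[\big(\ell_{10}(Y_1^n)\big)^s \prod_{k=1}^n \Big(\mathbb{E}_0\Big[\big(\ell_{10}(\delta^{Y_k}(X_k) \mid Y_k)\big)^s \,\Big|\, Y_k\Big] + \epsilon_1\Big)\Big] \\
&= \frac{1}{n} \log \prod_{k=1}^n \Big(\mathbb{E}_0\Big[\big(\ell_{10}(Y_k, \delta^{Y_k}(X_k))\big)^s\Big] + \epsilon_1 \mathbb{E}_0\Big[\big(\ell_{10}(Y_k)\big)^s\Big]\Big) \\
&\le \frac{1}{n} \sum_{k=1}^n \log \mathbb{E}_0\Big[\big(\ell_{10}(\gamma_k(X_k), \delta^{Y_k}(X_k))\big)^s\Big] + \frac{\epsilon}{2},
\end{aligned}$$

where the last inequality follows from (6), and the inequality $\log(x + \epsilon) \le \log x + \epsilon$ for $x \ge 1$.
Let $\xi_k \in \Gamma^2$ be such that $\xi_k(X_k) = (\gamma_k(X_k), \delta_k(X_k))$, where $\delta_k(X_k) = \delta^u(X_k)$ iff $\gamma_k(X_k) = u \in \mathcal{T}$.
We therefore have

$$\frac{1}{n} \psi_n(s) \le \frac{1}{n} \sum_{k=1}^n \varphi_{\xi_k}(s) + \frac{\epsilon}{2} \le \varphi_{\xi^*}(s) + \epsilon,$$

where the second inequality follows from claim (ii). The proof is now complete. ■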
Finally, we show that for both the full and restricted feedback architectures, feedback does
not improve the optimal error exponent.

Theorem 2: Suppose that Assumptions 1-3 hold. Then, in both the full and restricted feedback
architectures, there is no loss in optimality if sensors ignore the feedback messages from the
fusion center, i.e., $g_f^* = g_{rf}^* = g_{2p}^*$. Moreover, there is no loss in optimality if all sensors are
constrained to using the same quantization function.

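Proof: From (1) and (2), it suffices to show $g_f^* \ge g_{2p}^*$. Choose a sufficiently small $\epsilon > 0$. Let $\xi^*$ and
$h$ be chosen as in Lemma 3(ii), and let $t_\epsilon = -(\varphi_{\xi^*}(-h) + \epsilon)/h$. From the Chernoff bound and
Lemma 3(iii), we have

$$\begin{aligned}
\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}_0\big(\mathcal{L}_{10}^{(n)} \le n(t_\epsilon - \epsilon)\big) &\le \varphi_{\xi^*}(-h) + \epsilon + h(t_\epsilon - \epsilon) \\
&= -h\epsilon < 0.
\end{aligned}$$

Applying Lemma 2, we have $g_f^* \ge t_\epsilon - \epsilon$. The Taylor series expansion of $\varphi_{\xi^*}$ yields

$$t_\epsilon = \varphi_{\xi^*}'(0) - \varphi_{\xi^*}''(\theta)\, \frac{1}{2} \sqrt{\frac{\epsilon}{2c}} - \sqrt{2c\epsilon},$$

where $\theta \in [-h, 0]$, and $c$ is the same constant as in Lemma 3(i). Since $0 \le \varphi_{\xi^*}''(\theta) \le c$, $t_\epsilon \to g_{2p}^*$
as $\epsilon$ decreases to 0. Letting $\epsilon \to 0$, we obtain the theorem. ■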

D. Bayesian Formulation 

In this section, we show that feedback does not improve the optimal error exponent for the
binary Bayesian decentralized detection problem in the sequential, full, and restricted feedback
architectures. Let the prior probability of hypothesis $H_j$ be $\pi_j > 0$, $j = 0, 1$. Given a strategy,
the probability of error at the fusion center is $P_e(n) = \pi_0 \mathbb{P}_0(Y_f = 1) + \pi_1 \mathbb{P}_1(Y_f = 0)$. Let $P_e^*(n)$
be the minimum probability of error, over all strategies, for the $n$-sensor problem. We seek to
characterize the optimal error exponent

$$\limsup_{n \to \infty} \frac{1}{n} \log P_e^*(n).$$

From [39], the optimal error exponent for the parallel configuration without any feedback is
given by

$$\mathcal{E}_{2p}^* = \inf_{(\gamma, \delta) \in \Gamma^2} \min_{s \in [0, 1]} \log \mathbb{E}_0\big[(\ell_{10}(\gamma(X_1), \delta(X_1)))^s\big]. \tag{8}$$
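For a fixed pair of quantizers with finitely many message values, the inner minimization in (8) is a one-dimensional convex problem. The Python sketch below evaluates it by ternary search for a hypothetical pair of message distributions; the probability vectors, the function name `bayes_exponent`, and the use of ternary search are illustrative assumptions, and the outer infimum over quantizers is not attempted.

```python
import math

def bayes_exponent(p0, p1, iters=200):
    """Return (s, value) minimizing log E_0[l_10^s] over s in [0, 1] for message pmfs p0, p1.

    Assumes all entries of p0 and p1 are strictly positive.
    """
    def log_mgf(s):
        # E_0[l_10^s] = sum over messages of p0^(1-s) * p1^s
        return math.log(sum(a ** (1.0 - s) * b ** s for a, b in zip(p0, p1)))

    lo, hi = 0.0, 1.0
    for _ in range(iters):                 # ternary search, valid because log_mgf is convex
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_mgf(m1) < log_mgf(m2):
            hi = m2
        else:
            lo = m1
    s = 0.5 * (lo + hi)
    return s, log_mgf(s)

# Hypothetical symmetric one-bit message: the minimizer is s = 1/2 by symmetry.
s_star, exponent = bayes_exponent([0.7, 0.3], [0.3, 0.7])
```

The value returned is nonpositive, since the objective equals zero at both end points of the interval.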

Similar to the Neyman-Pearson formulation, we let $\mathcal{E}_{sf}^*$, $\mathcal{E}_{rf}^*$, and $\mathcal{E}_f^*$ denote the optimal error
exponents for the sequential, restricted, and full feedback architectures respectively. Note that
the counterparts of inequalities (1) and (2) also hold for the Bayesian error exponents. Therefore,
to show that feedback does not improve the asymptotic performance, it suffices to show a lower
bound for the full feedback architecture. Recall that $\psi_n(s) = \log \mathbb{E}_0[\exp(s \mathcal{L}_{10}^{(n)})]$ is the log
moment generating function of $\mathcal{L}_{10}^{(n)}$. The following lemma provides uniform bounds for $\psi_n$ and
its derivatives, over all strategies.

Lemma 4: Suppose that Assumptions 1 and 2 hold.

(i) For all $s \in [0, 1]$, we have $\mathbb{E}_0[\log \ell_{10}(X_1)] \le \psi_n'(s)/n \le \mathbb{E}_1[\log \ell_{10}(X_1)]$.

(ii) For any bounded sequence $(t_n)$ and for any given strategy such that there exists $s_n \in (0, 1)$
with $\psi_n'(s_n) = t_n$ for each $n$,$^3$ we have $\psi_n''(s_n) \le nC$, where $C$ is a constant independent
of the strategy.

(iii) For all $s \in [0, 1]$, we have $\psi_n(s) \ge n \mathcal{E}_{2p}^*$.

Proof: We first show claim ©. To show the bounds on ip' n {s), we note that ip n is convex, 
so ip' n (0) < ip' n (s) < ip' n (l) for all s E [0, 1]. Using Proposition IA.ll it is then easy to check that 
<(0) > nE [log4o(Xi)] and <(1) < nE 1 [log^o^)]. 
Next, we prove claim dn]). We have 

_ Eo^jo^exp^S?)] _ ( , )Y 
* M ~ Eo[exp( Sn £ff)] ~^ n{Sn)) 

<C 1 E [(/:S ) ) 2 ex P ( Sn 4J)], (9) 

3 Note that the sequence (s n ) depends on the strategy used. 
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where the inequality follows from the bound E [exp(s n £^' ) )] > 1/C\, for some constant C\ 
independent of the strategy. (This fact is proved in Proposition 3 of ll39ll .) The right-hand side 
of © can be upper bounded by observing that 



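$$\begin{aligned}
\mathbb{E}_0\left[(L_{10}^{(n)})^2 \exp(s_n L_{10}^{(n)})\right]
&= \mathbb{E}_0\left[(L_{10}^{(n)})^2 \exp(s_n L_{10}^{(n)}) \mathbf{1}\{L_{10}^{(n)} \le 0\}\right] + \mathbb{E}_1\left[(L_{10}^{(n)})^2 \exp(-(1-s_n) L_{10}^{(n)}) \mathbf{1}\{L_{10}^{(n)} > 0\}\right] \\
&\le \frac{4}{s_n^2} + \frac{4}{(1-s_n)^2},
\end{aligned} \tag{10}$$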



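where in the inequality, we use the result that the function $f_1(x) = x^2 \exp(s_n x)\mathbf{1}\{x \le 0\}$ is
maximized at $-2/s_n$, and the function $f_2(x) = x^2 \exp(-(1-s_n)x)\mathbf{1}\{x > 0\}$ is maximized at
$2/(1-s_n)$. It now suffices to show that both $s_n$ and $1 - s_n$ are at least $C_2/\sqrt{n}$ for some constant
$C_2$ independent of the particular strategy chosen. To simplify the notation, let $\ell_n = \exp(L_{10}^{(n)})$.
Suppose that $|t_n| \le t$ for all $n$. Using the inequalities $x^s \le sx + 1$ for $0 \le s \le 1$, and $x^s \ge x$
for $x \le 1$, we obtain from the equation $\psi_n'(s_n) = t_n$,⁴

$$\begin{aligned}
t_n \mathbb{E}_0\left[\exp(s_n L_{10}^{(n)})\right]
&= \mathbb{E}_0\left[(\ell_n)^{s_n} \log \ell_n\right] \\
&= \mathbb{E}_0\left[(\ell_n)^{s_n} (\log \ell_n)^+\right] - \mathbb{E}_0\left[(\ell_n)^{s_n} (\log \ell_n)^-\right] \\
&\le s_n\left(\mathbb{E}_0\left[\ell_n (\log \ell_n)^+\right] + \mathbb{E}_0\left[(\log \ell_n)^+\right]\right) - \mathbb{E}_0\left[\ell_n (\log \ell_n)^-\right],
\end{aligned}$$

which yields

$$s_n \ge \frac{\mathbb{E}_1\left[(\log \ell_n)^-\right] - t}{\mathbb{E}_1\left[(\log \ell_n)^+\right] + \mathbb{E}_0\left[(\log \ell_n)^+\right]}, \tag{11}$$

since $0 \le \mathbb{E}_0[\exp(s_n L_{10}^{(n)})] \le 1$ and $|t_n| \le t$.

⁴ We use the notations $x^+ = \max(x, 0)$ and $x^- = -\min(x, 0)$.

We first bound the denominator in (11) by using
$g(x) = x(\log x)^+$, which is a convex function, and Proposition A.1 to get

$$\begin{aligned}
\mathbb{E}_1\left[(\log \ell_n)^+\right] &= \mathbb{E}_0\left[g(\ell_n)\right] \\
&\le \mathbb{E}_0\left[g(\ell_{10}(X_1^n))\right] \\
&\le \mathbb{E}_0\left[\ell_{10}(X_1^n) \left|\log \ell_{10}(X_1^n)\right|\right] \\
&= \mathbb{E}_1\left[\left|\log \ell_{10}(X_1^n)\right|\right] \\
&\le \sum_{k=1}^n \mathbb{E}_1\left[\left|\log \ell_{10}(X_k)\right|\right] \\
&\le \sum_{k=1}^n \mathbb{E}_1\left[\log^2 \ell_{10}(X_k)\right] + n \\
&\le nC_3,
\end{aligned} \tag{12}$$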



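where $C_3$ is a constant, and the last inequality follows from Lemma 1. Similarly, it can be
shown that $\mathbb{E}_0[(\log \ell_n)^+] = \mathbb{E}_0[(L_{01}^{(n)})^-] \le \mathbb{E}_0[(L_{01}^{(n)})^+]$, where $L_{01}^{(n)} = -L_{10}^{(n)}$, is bounded by $nC_3$. Next, we show a
lower bound for the numerator in (11). Let

$$f(x) = \begin{cases} x(\log x)^-, & \text{if } 0 \le x \le 1, \\ 1 - x, & \text{if } x > 1, \end{cases}$$

which is a concave function not greater than $x(\log x)^-$. From Proposition A.1, we obtain

$$\begin{aligned}
\mathbb{E}_1\left[(\log \ell_n)^-\right] &= \mathbb{E}_0\left[\ell_n (\log \ell_n)^-\right] \\
&\ge \mathbb{E}_0\left[f(\ell_n)\right] \\
&\ge \mathbb{E}_0\left[f(\ell_{10}(X_1^n))\right] \\
&\ge \mathbb{E}_1\left[(\log \ell_{10}(X_1^n))^-\right] - \mathbb{P}_1\left(\ell_{10}(X_1^n) > 1\right).
\end{aligned}$$

Applying Fatou's Lemma and the Central Limit Theorem, we obtain

$$\liminf_{n \to \infty} \frac{\mathbb{E}_1\left[(\log \ell_n)^-\right]}{\sqrt{n}} \ge C_4, \tag{13}$$

where $C_4$ is a positive constant. Substituting the bounds (12) and (13) into (11), we finally have

$$\liminf_{n \to \infty} s_n \sqrt{n} \ge C_2,$$

for some positive constant $C_2$. A similar proof using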



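$$\mathbb{E}_1\left[L_{01}^{(n)} \exp\left((1-s_n) L_{01}^{(n)}\right)\right] = -\mathbb{E}_0\left[L_{10}^{(n)} \exp\left(s_n L_{10}^{(n)}\right)\right] = -t_n \mathbb{E}_0\left[\exp\left(s_n L_{10}^{(n)}\right)\right] \ge -t$$

shows that the same bound holds for $1 - s_n$. Therefore, from (10), claim (ii) holds.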
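The two maximizations invoked in the proof of claim (ii) can be checked by elementary calculus. The following worked step is a sketch, writing $s$ for $s_n \in (0,1)$:

```latex
\begin{align*}
f_1(x) &= x^2 e^{s x} \ (x \le 0): &
f_1'(x) &= x(2 + s x)e^{s x} = 0 \iff x \in \{0, -2/s\}, &
f_1(-2/s) &= \frac{4}{e^{2} s^{2}}, \\
f_2(x) &= x^2 e^{-(1-s)x} \ (x > 0): &
f_2'(x) &= x\bigl(2 - (1-s)x\bigr)e^{-(1-s)x} = 0 \iff x = \frac{2}{1-s}, &
f_2\!\left(\frac{2}{1-s}\right) &= \frac{4}{e^{2}(1-s)^{2}}.
\end{align*}
```

In particular $f_1 \le 4/s^2$ and $f_2 \le 4/(1-s)^2$ everywhere, so the right-hand side of (9) is of order $1/s_n^2 + 1/(1-s_n)^2$, which is $O(n)$ once both $s_n$ and $1 - s_n$ are at least $C_2/\sqrt{n}$.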

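In the following, we establish claim (iii). Let $\epsilon$ be a positive constant. Similar to the proof of
Lemma 3(iii), let $\delta_{Y_k} \in \Gamma$ be a function depending on the value of $Y_k$, so that

$$\mathbb{E}_0\left[\left(\ell_{10}(\delta_{Y_k}(X_k) \mid Y_k)\right)^s \,\middle|\, Y_k\right] \le \inf_{\delta \in \Gamma} \mathbb{E}_0\left[\left(\ell_{10}(\delta(X_k) \mid Y_k)\right)^s \,\middle|\, Y_k\right] + \epsilon.$$

We have

$$\begin{aligned}
\psi_n(s) &= \log \mathbb{E}_0\left[\left(\ell_{10}(Y_1^n)\right)^s \cdot \mathbb{E}_0\left[\prod_{k=1}^n \left(\ell_{10}(Z_k \mid Y_k, W_k)\right)^s \,\middle|\, Y_1^n\right]\right] \\
&\ge \log \mathbb{E}_0\left[\left(\ell_{10}(Y_1^n)\right)^s \cdot \prod_{k=1}^n \left(\mathbb{E}_0\left[\left(\ell_{10}(\delta_{Y_k}(X_k) \mid Y_k)\right)^s \,\middle|\, Y_k\right] - \epsilon\right)\right] \\
&= \sum_{k=1}^n \log \mathbb{E}_0\left[\left(\ell_{10}(Y_k)\right)^s \left(\mathbb{E}_0\left[\left(\ell_{10}(\delta_{Y_k}(X_k) \mid Y_k)\right)^s \,\middle|\, Y_k\right] - \epsilon\right)\right] \\
&\ge \sum_{k=1}^n \log\left(\mathbb{E}_0\left[\left(\ell_{10}(Y_k, \delta_{Y_k}(X_k))\right)^s\right] - \epsilon\right),
\end{aligned} \tag{14}$$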



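where we have used the inequality $\mathbb{E}_0[(\ell_{10}(Y_k))^s] \le 1$ in (14). Recall that $Y_k = \gamma_k(X_k)$. We can
define $\xi_k \in \Gamma^2$ such that $\xi_k(X_k) = (\gamma_k(X_k), \delta_k(X_k))$, where $\delta_k(X_k) = \delta_u(X_k)$ iff $\gamma_k(X_k) = u \in \mathcal{T}$.
From (14), we obtain the bound

$$\begin{aligned}
\psi_n(s) &\ge \sum_{k=1}^n \log\left(\mathbb{E}_0\left[\left(\ell_{10}(\xi_k(X_k))\right)^s\right] - \epsilon\right) \\
&\ge n \log\left(\inf_{\xi \in \Gamma^2} \mathbb{E}_0\left[\left(\ell_{10}(\xi(X_1))\right)^s\right] - \epsilon\right).
\end{aligned} \tag{15}$$

Since $\epsilon$ is arbitrary, the lemma is proved. ■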
Theorem 3: Suppose that Assumptions 1 and 2 hold. Then $\mathcal{E}^*_F = \mathcal{E}^*_{2P}$. Moreover, there is no
loss in optimality if sensors are constrained to using the same quantization function, which ignores
the feedback messages from the fusion center.

Proof: It is clear that $\mathcal{E}^*_F \le \mathcal{E}^*_{2P}$. To show the reverse bound, we make use of Proposition A.2.
Let the conditional probability of error under $H_j$ be $P_{n,j}$ for $j = 0, 1$. Let $s_n^* =
\arg\min_{s \in (0,1)} \psi_n(s)$ so that $\psi_n'(s_n^*) = 0$. From Proposition A.2, we have

$$\begin{aligned}
\max_{j=0,1} P_{n,j} &\ge \frac{1}{4} \exp\left(\psi_n(s_n^*) - \sqrt{2\psi_n''(s_n^*)}\right) \\
&\ge \exp\left(\psi_n(s_n^*) - C\sqrt{n}\right) \\
&\ge \exp\left(n\mathcal{E}^*_{2P} - C\sqrt{n}\right),
\end{aligned}$$

where $C$ is some constant. The penultimate inequality follows from Lemma 4(ii), and the last
inequality from Lemma 4(iii). Letting $n \to \infty$, we have

$$\limsup_{n \to \infty} \frac{1}{n} \log P_e(n) = \limsup_{n \to \infty} \frac{1}{n} \log \max_{j=0,1} P_{n,j} \ge \mathcal{E}^*_{2P}.$$

This implies that $\mathcal{E}^*_F \ge \mathcal{E}^*_{2P}$, and the proof is complete. ■
Since the sequential and restricted feedback configurations can perform no better than the full
feedback architecture, and no worse than the parallel configuration, we have the following result.

Theorem 4: Suppose that Assumptions 1 and 2 hold. Then $\mathcal{E}^*_S = \mathcal{E}^*_R = \mathcal{E}^*_{2P}$. Moreover, there
is no loss in optimality if sensors are constrained to using the same quantization function, which
ignores the feedback messages from the fusion center.

IV. One-Message Architectures 

In this section, we consider the one-message architecture. We study both the Neyman-Pearson 
and Bayesian formulations for the binary hypothesis testing problem. Similar to the two-message 
architecture, feedback in general does not improve the asymptotic detection performance, except 
for the case of Bayesian detection with restricted feedback in the daisy chain architecture. In 
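the case where there is no feedback [39], the optimal Neyman-Pearson error exponent is

$$g^*_{1P} = \inf_{\gamma \in \Gamma} \mathbb{E}_0\left[\log \ell_{10}(\gamma(X_1))\right],$$

while the optimal Bayesian error exponent is

$$\mathcal{E}^*_{1P} = \inf_{\gamma \in \Gamma} \min_{s \in [0,1]} \log \mathbb{E}_0\left[\left(\ell_{10}(\gamma(X_1))\right)^s\right].$$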
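As a numerical illustration of the two no-feedback exponents, the sketch below evaluates them for one fixed quantizer rather than taking the infimum over $\Gamma$. The message pmfs `p0` and `p1` and all helper names are hypothetical choices made for this illustration. The minimization over $s$ uses a plain ternary search, which relies on the convexity of $s \mapsto \log \mathbb{E}_0[\ell_{10}^s]$.

```python
import math

def np_exponent(p0, p1):
    """E_0[log l_10(Z)] = -KL(p0 || p1) for a message Z with positive pmfs p0, p1."""
    return sum(a * math.log(b / a) for a, b in zip(p0, p1) if a > 0)

def log_mgf(p0, p1, s):
    """log E_0[l_10(Z)^s] = log sum_z p0(z)^(1-s) * p1(z)^s."""
    return math.log(sum(a ** (1 - s) * b ** s for a, b in zip(p0, p1)))

def bayes_exponent(p0, p1, iters=200):
    """min over s in [0, 1] of log E_0[l_10(Z)^s], via ternary search."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_mgf(p0, p1, m1) < log_mgf(p0, p1, m2):
            hi = m2  # the minimizer lies to the left of m2
        else:
            lo = m1  # the minimizer lies to the right of m1
    return log_mgf(p0, p1, (lo + hi) / 2)

# Hypothetical binary message: P_0(Z = 1) = 0.2 and P_1(Z = 1) = 0.8.
p0 = [0.8, 0.2]
p1 = [0.2, 0.8]
np_exp = np_exponent(p0, p1)
bayes_exp = bayes_exponent(p0, p1)
print(np_exp, bayes_exp)
```

Both values are negative, and the Neyman-Pearson exponent is the more negative of the two, since $\log \mathbb{E}_0[\ell_{10}^s] \ge s\,\mathbb{E}_0[\log \ell_{10}]$ by Jensen's inequality.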

A. Full Information at Fusion Center 

We consider the case where the fusion center has access to all sensor messages. This is the case 
for the sequential feedback architecture in which the fusion center is the last sensor. The same 
applies for the full feedback daisy chain architecture. By ignoring all feedback messages except 
at the fusion center, these architectures are equivalent to the parallel configuration with the same 
number of sensors. Therefore, the optimal error exponents under both the Neyman-Pearson and 
Bayesian formulations are at least as negative as those for the parallel configuration. The proof of 
the reverse direction involves the same steps as in the proofs for the two-message architectures
in Section III. Specifically, the proof for the one-message sequential feedback architecture is
similar to that of Theorem 1 with suitable modifications (remove all references to the first
messages $\gamma_k$ and replace $Y_k$ by $Z_k$). The proof for the daisy-chain architecture corresponds to
that of Theorems 2 and 3. The result for the daisy-chain architecture under the Neyman-Pearson
formulation is also provided in [34]. The above discussion is summarized in the following result,
whose proof is omitted.

Theorem 5: Suppose that Assumptions 1 and 2 hold.

1) Under either the Neyman-Pearson or Bayesian formulation, the optimal error exponent for
the one-message sequential feedback architecture is the same as that of the parallel configuration under
the corresponding formulation.

2) Under the Bayesian formulation, the optimal error exponent for the full feedback daisy
chain is the same as that of the parallel configuration. In addition, if Assumption 3 holds,
the Neyman-Pearson error exponent is the same as that of the parallel configuration.

3) Furthermore, there is no loss in optimality if all sensors (except the last sensor in the one-message
sequential feedback architecture) are constrained to using the same quantization
function.

B. Restricted Feedback Daisy Chain 

In this section, we consider the restricted feedback daisy chain (RFDC) architecture. References
[34], [35] have shown that under the Neyman-Pearson formulation, feedback again does
not improve the optimal error exponent. In this section, we consider the Bayesian formulation, 
and show that unlike the Neyman-Pearson formulation, feedback may improve the detection 
performance. We provide a characterization of the optimal error exponent in this case. 

We assume that $\lim_{n \to \infty} m/n = r \in (0,1)$, otherwise the architecture is equivalent to a
parallel configuration. Let $\mathcal{E}^*_{dc}$ be the optimal error exponent. For $\gamma \in \Gamma$, and $j = 0, 1$, let the
Fenchel-Legendre transform of the log moment generating functions be

$$\Lambda^*_j(\gamma, t) = \sup_{s \in \mathbb{R}} \left\{ st - \log \mathbb{E}_j\left[e^{s \log \ell_{10}(\gamma(X_1))}\right] \right\}.$$

These are also known as rate functions [21]. For $i, j \in \{0, 1\}$, and for any given sequence of
strategies for the first $m$ sensors, let the rate of decay of the conditional probabilities be

$$e_{ij} = -\limsup_{m \to \infty} \frac{1}{m} \log \mathbb{P}_i(U = j).$$

We collect the decay rates into a vector

$$e = [e_{01}, e_{10}, e_{00}, e_{11}]. \tag{16}$$
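For a sensor that sends a binary message, the rate functions $\Lambda^*_j(\gamma, t)$ have a closed form as a Kullback-Leibler divergence between Bernoulli laws, a standard consequence of Cramér's theorem for two-valued variables. The sketch below compares that closed form with a direct numerical evaluation of the supremum over $s$, and also checks the identity $\Lambda^*_1(\gamma, t) = \Lambda^*_0(\gamma, t) - t$. The message probabilities `p0`, `p1`, the threshold `t`, and all helper names are hypothetical and chosen only for illustration.

```python
import math

def kl_bern(q, p):
    """KL(Bernoulli(q) || Bernoulli(p)), with the convention 0 log 0 = 0."""
    out = 0.0
    if q > 0:
        out += q * math.log(q / p)
    if q < 1:
        out += (1 - q) * math.log((1 - q) / (1 - p))
    return out

def llr_values(p0, p1):
    """Log-likelihood ratios of the message values 0 and 1."""
    return math.log((1 - p1) / (1 - p0)), math.log(p1 / p0)

def rate_closed_form(j, p0, p1, t):
    """Lambda*_j(t) = KL(Bern(q) || Bern(p_j)), where q*a1 + (1-q)*a0 = t."""
    a0, a1 = llr_values(p0, p1)
    q = (t - a0) / (a1 - a0)
    if q < 0 or q > 1:
        return math.inf  # t lies outside the range of the empirical mean
    return kl_bern(q, p1 if j == 1 else p0)

def rate_numeric(j, p0, p1, t, lo=-60.0, hi=60.0, iters=300):
    """sup_s { s*t - log E_j[exp(s * log l_10)] } by ternary search (concave in s)."""
    a0, a1 = llr_values(p0, p1)
    pj = p1 if j == 1 else p0

    def objective(s):
        return s * t - math.log(pj * math.exp(s * a1) + (1 - pj) * math.exp(s * a0))

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    return objective((lo + hi) / 2)

p0, p1 = 0.05, 0.8  # hypothetical P_0(message = 1) and P_1(message = 1)
t = 0.3
exact0 = rate_closed_form(0, p0, p1, t)
exact1 = rate_closed_form(1, p0, p1, t)
approx0 = rate_numeric(0, p0, p1, t)
approx1 = rate_numeric(1, p0, p1, t)
print(exact0, approx0, exact1, approx1)
```

The search interval `[-60, 60]` is an assumption that suffices for these parameters. A threshold `t` close to the extreme log-likelihood ratio values would need a wider interval.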
Lemma 5: Suppose Assumptions 1 and 2 hold. Suppose that the quantization functions for
sensors $1, \ldots, m$ in a RFDC have been fixed. Then, we have

$$\limsup_{n \to \infty} \frac{1}{n} \log P_e(n) = -h(e),$$

where $e$ is as defined in (16),

$$h(e) = \min\left\{ (1-r) \sup_{\delta^0 \in \Gamma} \Lambda^*_0\left(\delta^0, \frac{r}{1-r}(e_{10} - e_{00})\right) + r e_{00},\ (1-r) \sup_{\delta^1 \in \Gamma} \Lambda^*_1\left(\delta^1, -\frac{r}{1-r}(e_{01} - e_{11})\right) + r e_{11} \right\}, \tag{17}$$

and $P_e(n)$ is the optimal probability of error under the given quantization functions for sensors
$1, \ldots, m$.

Proof: Let us fix a sequence of strategies that conform to the given quantization functions
for sensors $1, \ldots, m$. Let $\alpha_n$ and $\beta_n$ be the Type I and II error probabilities of a strategy with
the fusion rule

$$Y_f = \begin{cases} 0, & \text{if } L_{10}^{(n)} \le 0, \\ 1, & \text{if } L_{10}^{(n)} > 0. \end{cases}$$



From the Neyman-Pearson Lemma pT|. the optimal decision rule at the fusion center is the 
Neyman-Pearson test. Moreover, for any given fusion rule, either its Type I error probability
is at least $\alpha_n$ or its Type II error probability is at least $\beta_n$. Therefore, we have

$$\limsup_{n \to \infty} \frac{1}{n} \log P_e(n) \ge \min\left\{ \limsup_{n \to \infty} \frac{1}{n} \log \alpha_n,\ \limsup_{n \to \infty} \frac{1}{n} \log \beta_n \right\}. \tag{18}$$

Thus it suffices to find a lower bound for the strategy using a zero threshold log likelihood
ratio test as a fusion rule. Henceforth, we will assume that such a fusion rule is employed.
Conditioning on the value of $U$, we have

$$\mathbb{P}_1(Y_f = 0) = \mathbb{P}_1(Y_f = 0 \mid U = 0)\,\mathbb{P}_1(U = 0) + \mathbb{P}_1(Y_f = 0 \mid U = 1)\,\mathbb{P}_1(U = 1).$$

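Fix an $\epsilon > 0$. Let $\delta_k(\cdot, u) = \delta_k^u(\cdot) \in \Gamma$ be a function that depends on the value of $u$. Let
$l = n - m$. Using the lower bound in Cramér's Theorem (cf. [21]) and Lemma 1, we obtain

$$\begin{aligned}
\frac{1}{n} \log \mathbb{P}_1(Y_f = 0 \mid U = 0)
&= \frac{1}{n} \log \mathbb{P}_1\left(L_{10}^{(n)} \le 0 \mid U = 0\right) \\
&= \frac{1}{n} \log \mathbb{P}_1\left( \frac{1}{l} \sum_{k=m+1}^{n} \log \ell_{10}(\delta_k^0(X_k)) \le -\frac{1}{l} \log \frac{\mathbb{P}_1(U = 0)}{\mathbb{P}_0(U = 0)} \right) \\
&\ge -\frac{l}{n} \sup_{\delta^0 \in \Gamma} \Lambda^*_1\left( \delta^0, -\frac{1}{l} \log \frac{\mathbb{P}_1(U = 0)}{\mathbb{P}_0(U = 0)} - \epsilon \right) + o(1),
\end{aligned}$$

where $o(1)$ is a term that goes to zero as $n \to \infty$. Taking $n \to \infty$ and then $\epsilon \to 0$, we obtain

$$\begin{aligned}
\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}_1(Y_f = 0 \mid U = 0) &+ \limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}_1(U = 0) \\
&\ge -(1-r) \sup_{\delta^0 \in \Gamma} \Lambda^*_1\left( \delta^0, \frac{r}{1-r}(e_{10} - e_{00}) \right) - r e_{10} \\
&= -(1-r) \sup_{\delta^0 \in \Gamma} \Lambda^*_0\left( \delta^0, \frac{r}{1-r}(e_{10} - e_{00}) \right) - r e_{00}.
\end{aligned} \tag{19}$$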
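The last equality in (19) rests on a standard identity between the two rate functions. A short derivation follows, writing $\Lambda_j(\delta, s) = \log \mathbb{E}_j[e^{s \log \ell_{10}(\delta(X_1))}]$ for the log moment generating functions (this shorthand is introduced here only for the sketch):

```latex
\begin{align*}
\Lambda_1(\delta, s) &= \log \mathbb{E}_0\!\left[\ell_{10}(\delta(X_1))\, e^{s \log \ell_{10}(\delta(X_1))}\right] = \Lambda_0(\delta, s+1), \\
\Lambda_1^*(\delta, t) &= \sup_{s}\,\{ st - \Lambda_0(\delta, s+1) \}
 = \sup_{u}\,\{ (u-1)t - \Lambda_0(\delta, u) \}
 = \Lambda_0^*(\delta, t) - t.
\end{align*}
```

With $t = \frac{r}{1-r}(e_{10} - e_{00})$, so that $(1-r)t = r(e_{10} - e_{00})$, this gives $(1-r)\Lambda^*_1(\delta, t) + r e_{10} = (1-r)\Lambda^*_0(\delta, t) + r e_{00}$.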
In the same way, it can be checked that

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}_1(Y_f = 0 \mid U = 1) + \limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}_1(U = 1) \ge -(1-r) \sup_{\delta^1 \in \Gamma} \Lambda^*_1\left( \delta^1, -\frac{r}{1-r}(e_{01} - e_{11}) \right) - r e_{11}, \tag{20}$$

and we obtain

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}_1(Y_f = 0) \ge -h(e).$$

A similar proof shows that

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}_0(Y_f = 1) \ge -h(e),$$

and that the optimal error exponent is lower bounded by $-h(e)$. We note that this lower bound can
be asymptotically achieved by letting all sensors in the second stage quantize their observations
using $\delta^0$ and $\delta^1$, depending on whether the feedback message is 0 or 1 respectively, and where
$\delta^0$ and $\delta^1$ are chosen to asymptotically maximize their respective rate functions in (17). The
proof is now complete. ■

Reference [34] shows that if $e_{01} > 0$ and $e_{10} > 0$, then there is no loss in optimality if sensors
within each stage are constrained to using the same quantization function. In the following, we
show that it is optimal to require that $e_{01} > 0$ and $e_{10} > 0$. We also provide a characterization
of the optimal error exponent.

Theorem 6: Suppose that Assumptions 1 and 2 hold. Then, the following statements hold for
a RFDC architecture.

(i) There is no loss in optimality if $e_{01}$ and $e_{10}$ are constrained to be strictly positive.

(ii) There is no loss in optimality if sensors in the first stage are constrained to using the same
quantization function.

(iii) There is no loss in optimality if sensors in the second stage are constrained to using the
same quantization function (which may depend on the feedback message).

(iv) The optimal error exponent for the RFDC is

$$\mathcal{E}^*_{dc} = -(1-r) \sup_{\substack{\gamma, \delta^0, \delta^1 \in \Gamma \\ t \in \mathbb{R}}} \min\left\{ \Lambda^*_0\left( \delta^0, \frac{r}{1-r} \Lambda^*_1(\gamma, t) \right),\ \Lambda^*_1\left( \delta^1, -\frac{r}{1-r} \Lambda^*_0(\gamma, t) \right) \right\}. \tag{21}$$

Proof: We first show claim (i). Note that only one of $e_{01}$ and $e_{00}$ can be strictly positive.
The same applies to $e_{10}$ and $e_{11}$. If $e_{01} > 0$ and $e_{10} > 0$, we have $e_{00} = e_{11} = 0$, and (17) yields

$$\begin{aligned}
h(e) &= (1-r) \min\left\{ \sup_{\delta \in \Gamma} \Lambda^*_0\left( \delta, \frac{r}{1-r} e_{10} \right),\ \sup_{\delta \in \Gamma} \Lambda^*_1\left( \delta, -\frac{r}{1-r} e_{01} \right) \right\} \\
&\ge (1-r) \min\left\{ \sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0),\ \sup_{\delta \in \Gamma} \Lambda^*_1(\delta, 0) \right\} \\
&= (1-r) \sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0).
\end{aligned}$$

Suppose that $e_{00} > 0$ and $e_{10} > 0$. Then, $e_{01} = e_{11} = 0$, and from (17), we have $h(e) \le
(1-r) \sup_{\delta \in \Gamma} \Lambda^*_1(\delta, 0)$. The same argument applies for the case where $e_{01} > 0$ and $e_{11} > 0$, and
the case where all the decay rates are zero. Therefore, there is no loss in optimality if $e_{01}$ and
$e_{10}$ are constrained to be strictly positive.

Claims (ii) and (iii) follow from either an application of Cramér's Theorem (cf. [21]) and
(17), or from [53].

Finally, we prove claim (iv). Since there is no loss in optimality if all first stage sensors are
restricted to some same quantization function $\gamma \in \Gamma$, the first stage Type I and II error decay
rates are $e_{01} = \Lambda^*_0(\gamma, t)$ and $e_{10} = \Lambda^*_1(\gamma, t)$ respectively, for some $t$ (cf. [26]). Applying Lemma 5
and optimizing over $\gamma$ and $t$, we have shown that the optimal error exponent is lower bounded
by the right hand side of (21). This bound is achievable, hence the claim follows. The proof is
now complete. ■

Let $\mathcal{E}^*_t$ be the optimal error exponent of the daisy-chain if the second stage sensors ignore
the feedback message. This is equivalent to a tree architecture with a height of two. Using the
same arguments as above, it can be shown that

$$\mathcal{E}^*_t = -(1-r) \sup_{\substack{\gamma, \delta \in \Gamma \\ t \in \mathbb{R}}} \min\left\{ \Lambda^*_0\left( \delta, \frac{r}{1-r} \Lambda^*_1(\gamma, t) \right),\ \Lambda^*_1\left( \delta, -\frac{r}{1-r} \Lambda^*_0(\gamma, t) \right) \right\}. \tag{22}$$

Comparing (21) and (22), we have $\mathcal{E}^*_{dc} \le \mathcal{E}^*_t$, i.e., the optimal error exponent for the RFDC
is in general at least as good as that of the tree configuration where feedback is absent. In the following, we
provide a sufficient condition for no loss in performance when feedback is ignored, i.e., $\mathcal{E}^*_{dc} = \mathcal{E}^*_t$.
We also provide a numerical example in which $\mathcal{E}^*_{dc} < \mathcal{E}^*_t$, i.e., feedback can strictly improve the
asymptotic performance in some cases.

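Proposition 1: Suppose that there exists $\delta \in \Gamma$ such that

$$\mathcal{E}^*_t = -(1-r) \sup_{\gamma \in \Gamma,\, t \in \mathbb{R}} \min\left\{ \Lambda^*_0\left( \delta, \frac{r}{1-r} \Lambda^*_1(\gamma, t) \right),\ \Lambda^*_1\left( \delta, -\frac{r}{1-r} \Lambda^*_0(\gamma, t) \right) \right\}, \tag{23}$$

and $\Lambda^*_0(\delta, t) = \Lambda^*_1(\delta, -t)$ for all $t$. Then,

$$\mathcal{E}^*_{dc} = \mathcal{E}^*_t = -(1-r) \sup_{\gamma, \delta \in \Gamma} \Lambda^*_0\left( \delta, \frac{r}{1-r} \Lambda^*_1(\gamma, 0) \right).$$

Therefore, there is no loss in optimality if the RFDC second stage sensors ignore the feedback
message.

Proof: To simplify the proof, we assume that $\gamma \in \Gamma$ and $t \in \mathbb{R}$ can be chosen so that the
supremum in (23) is achieved. To find the optimal threshold $t$, we set

$$\Lambda^*_0\left( \delta, \frac{r}{1-r} \Lambda^*_1(\gamma, t) \right) = \Lambda^*_1\left( \delta, -\frac{r}{1-r} \Lambda^*_0(\gamma, t) \right).$$

From the proposition hypothesis, we obtain $\Lambda^*_1(\gamma, t) = \Lambda^*_0(\gamma, t)$, which implies that $t = 0$.
Therefore,

$$\mathcal{E}^*_t = -(1-r) \sup_{\gamma, \delta \in \Gamma} \Lambda^*_0\left( \delta, \frac{r}{1-r} \Lambda^*_1(\gamma, 0) \right). \tag{24}$$

Suppose that there exists $\delta^0 \ne \delta^1$, and $v \ne 0$ such that

$$\min\left\{ \Lambda^*_0\left( \delta^0, \frac{r}{1-r} \Lambda^*_1(\gamma, v) \right),\ \Lambda^*_1\left( \delta^1, -\frac{r}{1-r} \Lambda^*_0(\gamma, v) \right) \right\} > \Lambda^*_0\left( \delta, \frac{r}{1-r} \Lambda^*_1(\gamma, 0) \right). \tag{25}$$

If $v > 0$, $\Lambda^*_1(\gamma, v) < \Lambda^*_1(\gamma, 0)$ since $\Lambda^*_1(\gamma, \cdot)$ is a decreasing function. Therefore,

$$\Lambda^*_0\left( \delta^0, \frac{r}{1-r} \Lambda^*_1(\gamma, 0) \right) \ge \Lambda^*_0\left( \delta^0, \frac{r}{1-r} \Lambda^*_1(\gamma, v) \right) > \Lambda^*_0\left( \delta, \frac{r}{1-r} \Lambda^*_1(\gamma, 0) \right),$$

a contradiction to (24). A similar argument produces a contradiction if $v < 0$. Therefore, we
must have $v = 0$. But this implies that (25) cannot hold. Hence, $\mathcal{E}^*_{dc} = \mathcal{E}^*_t$ and the proposition
is proved. ■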
The following example shows that in some cases, the RFDC performs strictly better in the 
presence of feedback. 

Example 1: Let $X_k$ take values in the set $\{1, 2, 3\}$, and suppose that sensor messages are
restricted to a single bit. Assume that the probability mass functions under the two hypotheses
are as shown in Table I. We also let $m = n/2$, i.e., $r = 1 - r = 1/2$.

TABLE I
Probability mass functions for Example 1

  X_k       1        2        3
  H_0      4/5      3/20     1/20
  H_1      1/20     3/20     4/5

Since $\ell_{10}(X_k)$ is increasing with $X_k$, the two possible 1-bit quantizers are $\gamma_1(X_k) = 0$ iff
$X_k = 1$, and $\gamma_2(X_k) = 0$ iff $X_k \in \{1, 2\}$. We optimize (21) over these two quantizers and
the threshold $t$. The results are shown in Figure 3. The optimal error exponent is found to be
$-0.5 \cdot 0.73 = -0.365$, and is achieved by having all second stage sensors use $\gamma_2$ if the feedback
message is 0, and $\gamma_1$ if the feedback message is 1. On the other hand, if feedback is ignored,
the optimal quantizer is $\gamma_2$, and the optimal error exponent is $-0.356$, which is strictly worse
than that with feedback.
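The optimization in Example 1 can be reproduced approximately with a short grid search. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses the closed form of the rate function for a binary message (a Bernoulli Kullback-Leibler divergence), and it restricts the first-stage threshold $t$ to lie between the two conditional means of the log-likelihood ratio, where the rate functions coincide with the Type I and II decay rates. All function names and the grid size are hypothetical.

```python
import math

# Probability mass functions of Table I on {1, 2, 3}.
P0 = {1: 4 / 5, 2: 3 / 20, 3: 1 / 20}
P1 = {1: 1 / 20, 2: 3 / 20, 3: 4 / 5}

def message_probs(zero_set):
    """(P_0(message = 1), P_1(message = 1)) for a quantizer mapping zero_set to 0."""
    p0 = sum(p for x, p in P0.items() if x not in zero_set)
    p1 = sum(p for x, p in P1.items() if x not in zero_set)
    return p0, p1

def kl_bern(q, p):
    """KL(Bernoulli(q) || Bernoulli(p)), with the convention 0 log 0 = 0."""
    out = 0.0
    if q > 0:
        out += q * math.log(q / p)
    if q < 1:
        out += (1 - q) * math.log((1 - q) / (1 - p))
    return out

def rate(j, probs, t):
    """Lambda*_j(gamma, t) for a binary message; infinite outside the LLR range."""
    p0, p1 = probs
    a0 = math.log((1 - p1) / (1 - p0))
    a1 = math.log(p1 / p0)
    q = (t - a0) / (a1 - a0)
    if q < -1e-12 or q > 1 + 1e-12:
        return math.inf
    q = min(max(q, 0.0), 1.0)
    return kl_bern(q, p1 if j == 1 else p0)

def exponents(r=0.5, grid=4001):
    quantizers = [message_probs({1}), message_probs({1, 2})]  # gamma_1, gamma_2
    c = r / (1 - r)
    best_dc = best_tree = 0.0
    for p0, p1 in quantizers:                    # first-stage quantizer gamma
        a0 = math.log((1 - p1) / (1 - p0))
        a1 = math.log(p1 / p0)
        for i in range(grid):
            q = p0 + (p1 - p0) * i / (grid - 1)  # threshold between the two means
            t = a0 + q * (a1 - a0)
            e01 = rate(0, (p0, p1), t)           # first-stage Type I decay rate
            e10 = rate(1, (p0, p1), t)           # first-stage Type II decay rate
            f0 = [rate(0, d, c * e10) for d in quantizers]
            f1 = [rate(1, d, -c * e01) for d in quantizers]
            # (21): second-stage quantizers may differ with the feedback bit.
            best_dc = max(best_dc, min(max(f0), max(f1)))
            # (22): the same second-stage quantizer is used for both bits.
            best_tree = max(best_tree, max(min(u, v) for u, v in zip(f0, f1)))
    chernoff = max(rate(0, d, 0.0) for d in quantizers)  # parallel configuration
    return -(1 - r) * best_dc, -(1 - r) * best_tree, -chernoff

e_dc, e_tree, e_parallel = exponents()
print(round(e_dc, 3), round(e_tree, 3), round(e_parallel, 3))
```

The first two printed values should be close to the exponents quoted in the example. The third is the exponent of a parallel configuration that uses the better of the two quantizers, included for comparison with the RFDC.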

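Fig. 3. Plot of the rate functions for $\gamma_1$ and $\gamma_2$. The mark 'x' indicates the optimal error decay rate (up to a constant 1/2)
when the feedback message $U = 0$, while '+' indicates the optimal error decay rate (up to a constant 1/2) when the feedback
message $U = 1$. The optimal quantizers are achieved on rate functions belonging to different quantizers.

In the following, we show that the RFDC performs strictly worse than a parallel configuration,
and hence it has performance strictly inferior to a full feedback daisy-chain architecture.

Proposition 2: Suppose that the supremum in (21) is achieved. Then, the RFDC performs
strictly worse than the parallel configuration with the same total number of sensors, i.e.,
$\mathcal{E}^*_t \ge \mathcal{E}^*_{dc} > \mathcal{E}^*_{1P} = -\sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0)$.

Proof: Let $\gamma, \delta^0, \delta^1, t$ achieve the supremum in (21). If $\Lambda^*_1(\gamma, t) = 0$, then from (21), we
have

$$\mathcal{E}^*_{dc} \ge -(1-r) \Lambda^*_0(\delta^0, 0) > -\sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0),$$

since $r > 0$. A similar argument shows that $\mathcal{E}^*_{dc} > \mathcal{E}^*_{1P}$ if $\Lambda^*_0(\gamma, t) = 0$. Therefore, in the
following, we assume that $\Lambda^*_j(\gamma, t) > 0$ for $j = 0, 1$.
We have

$$\begin{aligned}
(1-r) \Lambda^*_0\left( \delta^0, \frac{r}{1-r} \Lambda^*_1(\gamma, t) \right)
&= (1-r) \Lambda^*_1\left( \delta^0, \frac{r}{1-r} \Lambda^*_1(\gamma, t) \right) + r \Lambda^*_1(\gamma, t) \\
&< (1-r) \Lambda^*_1(\delta^0, 0) + r \Lambda^*_1(\gamma, t) \\
&\le (1-r) \sup_{\delta \in \Gamma} \Lambda^*_1(\delta, 0) + r \Lambda^*_1(\gamma, t),
\end{aligned} \tag{26}$$

where the penultimate inequality follows from $\Lambda^*_1(\delta^0, \cdot)$ being a decreasing function, and $\Lambda^*_1(\gamma, t) >
0$. Similarly,

$$(1-r) \Lambda^*_1\left( \delta^1, -\frac{r}{1-r} \Lambda^*_0(\gamma, t) \right) < (1-r) \sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0) + r \Lambda^*_0(\gamma, t). \tag{27}$$

Combining (26) and (27), and since $\Lambda^*_0(\delta, 0) = \Lambda^*_1(\delta, 0)$ for all $\delta \in \Gamma$, we obtain

$$\begin{aligned}
\mathcal{E}^*_{dc} &> -(1-r) \sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0) - r \min\left\{ \Lambda^*_0(\gamma, t), \Lambda^*_1(\gamma, t) \right\} \\
&\ge -(1-r) \sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0) - r \Lambda^*_0(\gamma, 0) \\
&\ge -\sup_{\delta \in \Gamma} \Lambda^*_0(\delta, 0) = \mathcal{E}^*_{1P}.
\end{aligned}$$

The proof is now complete. ■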

V. Conclusion 

We have studied two-message feedback architectures, in which each sensor has access to 
compressed summaries of some or all other sensors' first messages to the fusion center. In the 
sequential feedback architecture, each sensor has access to the first messages of those sensors that 
communicate with the fusion center before it. In the restricted and full feedback architectures, 
each sensor has partial and full information respectively, about the first messages of every other 
sensor. Under both the Neyman-Pearson and Bayesian formulations, we show that the optimal 
error exponent is not improved by the feedback messages. We have also studied the one-message 
feedback architectures in which a group of sensors have access to information from sensors in 
a first group. We show that if the fusion center has knowledge of all the messages from the 
sensors in the first group, then feedback does not improve the optimal error exponent, which 
is the same as the parallel configuration. In the case where the fusion center has only limited 
knowledge (a 1-bit summary) of the messages, feedback can improve the optimal error exponent, 
but the optimal error exponent is strictly worse than that of the parallel configuration. Our results 
suggest that in the regime of a large number of sensors, and where the fusion center has sufficient 
memory, the performance gain in binary hypothesis testing due to feedback does not justify the 
increase in communication and computation costs incurred in a feedback architecture.

In the two-message feedback architecture, we assumed that the fusion center has unlimited
memory and remembers all the first messages. The case where the fusion center retains only
a finitely-valued summary of the first messages has been studied in [33], but under various
assumptions including finitely-valued observation spaces, sensors all using the same quantization
functions, and constraints on the feedback messages. Reference [33] shows that feedback does
not improve the error exponent. The same problem in the general setting that we have considered 
in this paper remains open. 

In the case of Bayesian M-ary hypothesis testing, where M > 2, we conjecture that feedback 
improves the optimal error exponent. Characterizing the optimal feedback strategy and error 
exponent is part of future work. This research is also part of our ongoing efforts to quantify 
the performance of various network architectures. Future research directions include studying 
network architectures with more general loop structures.

Appendix A
Mathematical Preliminaries

In this appendix, we collect two well known results that are useful in our proofs. The first 
result is an elementary fact, which is an application of Jensen's inequality. A proof can be found 
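in [39], and is omitted here.

Proposition A.1: Suppose $\phi : (0, \infty) \to \mathbb{R}$ is a convex function. Then for any function $\gamma$, we
have

$$\mathbb{E}_0\left[\phi(\ell_{10}(\gamma(X)))\right] \le \mathbb{E}_0\left[\phi(\ell_{10}(X))\right].$$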

The following lower bound for the maximum of the Type I and II error probabilities was first
proved in [43] for the case of discrete observation spaces. The following proposition generalizes
the result to a general observation space. The proof is identical to that in [43], with some notation
changes, and is provided for completeness.

Proposition A.2: Consider a hypothesis testing problem based on an observation $X$ with
distribution $\mathbb{P}_j$ under hypothesis $H_j$, $j = 0, 1$. Suppose that the measures $\mathbb{P}_0$ and $\mathbb{P}_1$ are absolutely
continuous w.r.t. each other. Let $P_{e,j}$ be the probability of error when $H_j$ is true. Let $Z =
\log \frac{d\mathbb{P}_1}{d\mathbb{P}_0}(X)$ be the log Radon-Nikodym derivative. For any $s \in \mathbb{R}$, let $\Lambda(s) = \log \mathbb{E}_0[\exp(sZ)]$
be the log-moment generating function of $Z$. Then, for $s^* \in [0, 1]$ such that $\Lambda'(s^*) = 0$, we have

$$\max(P_{e,0}, P_{e,1}) \ge \frac{1}{4} \exp\left( \Lambda(s^*) - \sqrt{2\Lambda''(s^*)} \right).$$



Proof: The proof steps are identical to that of Theorem 5 in [43]. Let $P_j$ be the probability
measure of $Z$ under hypothesis $H_j$, $j = 0, 1$. For $s \in (0, 1)$, define the probability measure $Q$
such that

$$\frac{dQ}{dP_0}(z) = \exp\left(sz - \Lambda(s)\right),$$

and let $\mathbb{E}_Q$ and $\mathrm{var}_Q$ be the mathematical expectation and variance w.r.t. $Q$, respectively. Let
$Y$ be a random variable with distribution $Q$. Then, it is easy to check that $\mathbb{E}_Q[Y] = \Lambda'(s)$ and
$\mathrm{var}_Q(Y) = \Lambda''(s)$. Let $A_s = \{y : |y - \Lambda'(s)| \le \sqrt{2\Lambda''(s)}\}$. From Chebyshev's inequality, we
have

$$Q(A_s) \ge \frac{1}{2}. \tag{28}$$

For any measurable set $A$, we have

$$\begin{aligned}
P_0(A) &= \mathbb{E}_Q\left[\exp(-sZ + \Lambda(s)) \mathbf{1}\{Z \in A\}\right] \\
&\ge \mathbb{E}_Q\left[\exp(-sZ + \Lambda(s)) \mathbf{1}\{Z \in A \cap A_s\}\right] \\
&\ge \exp\left( \Lambda(s) - s\Lambda'(s) - s\sqrt{2\Lambda''(s)} \right) Q(A \cap A_s).
\end{aligned}$$

Similarly, we have

$$P_1(A^c) \ge \exp\left( \Lambda(s) + (1-s)\Lambda'(s) - (1-s)\sqrt{2\Lambda''(s)} \right) Q(A^c \cap A_s).$$

From (28), either $Q(A \cap A_s) \ge 1/4$ or $Q(A^c \cap A_s) \ge 1/4$. Therefore, we have either

$$P_0(A) \ge \frac{1}{4} \exp\left( \Lambda(s) - s\Lambda'(s) - s\sqrt{2\Lambda''(s)} \right), \tag{29}$$

or

$$P_1(A^c) \ge \frac{1}{4} \exp\left( \Lambda(s) + (1-s)\Lambda'(s) - (1-s)\sqrt{2\Lambda''(s)} \right). \tag{30}$$

Since $\Lambda(s)$ is convex with $\Lambda(0) = \Lambda(1) = 0$, there exists $s^* \in (0, 1)$ such that $\Lambda'(s^*) = 0$.
Substituting this into (29) and (30), we obtain

$$\max(P_0(A), P_1(A^c)) \ge \frac{1}{4} \exp\left( \Lambda(s^*) - \sqrt{2\Lambda''(s^*)} \right).$$

The proof is now complete. ■
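The bound of Proposition A.2 can be sanity-checked numerically in a simple case. The sketch below is only an illustration: it assumes $n$ i.i.d. Bernoulli observations with hypothetical parameters `p0` and `p1`, finds $s^*$ by bisection on $\Lambda'$, and compares the bound with the exact error probabilities of the zero-threshold likelihood ratio test. The function name and parameter values are our own choices.

```python
import math

def check_bound(n=20, p0=0.3, p1=0.6):
    a1 = math.log(p1 / p0)              # log-likelihood ratio of a single 1
    a0 = math.log((1 - p1) / (1 - p0))  # log-likelihood ratio of a single 0

    def tilted_q(s):
        """Probability of a 1 under the per-sample tilted measure Q."""
        w1 = p0 * math.exp(s * a1)
        w0 = (1 - p0) * math.exp(s * a0)
        return w1 / (w0 + w1)

    # Lambda'(s) = n * (q*a1 + (1-q)*a0) is increasing in s; bisect for its root.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        q = tilted_q(mid)
        if q * a1 + (1 - q) * a0 < 0:
            lo = mid
        else:
            hi = mid
    s_star = (lo + hi) / 2
    q = tilted_q(s_star)
    lam = n * math.log(p0 * math.exp(s_star * a1) + (1 - p0) * math.exp(s_star * a0))
    lam2 = n * q * (1 - q) * (a1 - a0) ** 2  # Lambda''(s*), the variance under Q
    bound = 0.25 * math.exp(lam - math.sqrt(2 * lam2))

    # Exact error probabilities of the zero-threshold likelihood ratio test.
    pe0 = pe1 = 0.0
    for k in range(n + 1):
        z = k * a1 + (n - k) * a0
        if z > 0:   # decide H1: an error under H0
            pe0 += math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
        else:       # decide H0: an error under H1
            pe1 += math.comb(n, k) * p1 ** k * (1 - p1) ** (n - k)
    return pe0, pe1, bound

pe0, pe1, bound = check_bound()
print(max(pe0, pe1), bound)
```

The bound is loose for small $n$, as expected from the Chebyshev step in the proof. Its value lies in the exponent: the $\sqrt{n}$ correction is negligible against the linear term as $n$ grows.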